Description

BACKGROUND OF THE INVENTION: 

Field of the Invention: 

[0001] The present invention relates to a microprocessor which is
capable of processing a variable-length instruction set and which
decodes a plurality of instructions in parallel.

Description of the Prior Art:

[0002] In any prior-art microprocessor capable of processing a
variable-length instruction set, the parallel decode of instructions
is not performed.
[0003] As a known example pertinent to the present invention, there is
mentioned an instruction decoding method stated in a treatise "32-bit
microprocessor V80 wherein the disturbance of a pipeline is suppressed
by building in a cache and a branch prediction mechanism, etc., thereby
to enhance a performance" contained on pp. 195 - 206 in NIKKEI
ELECTRONICS BOOKS "New-Generation Microprocessors RISC, CISC, TRON"
published on September 11, 1989.
[0004] In the known microprocessor, a plurality of in- 
structions are not really decoded in parallel, but an in- 
struction is decoded in two stages, thereby to enhance 
the throughput of decode capability. The first-stage de- 
code circuit of this known microprocessor is called a pre- 
decode unit, which has the function of decomposing a 
variable-length instruction into elements of fixed length.
The instruction decomposed into the fixed-length ele- 
ments in this manner is once stored in a buffer (register) 
within the pre-decode unit, and it is transferred from the 
pre-decode unit to an instruction decode unit in compli- 
ance with the request of the instruction decode unit. 
[0005] Meanwhile, the official gazette of Japanese Patent Application
Laid-open No. 244233/1988 discloses a microprocessor which is intended
to shorten the decode time period of a variable-length machine language
by decoding a plurality of unit instructions in parallel. With the
microprocessor, the machine language instructions of 2 bytes are
accepted from outside each time, and the unit instruction of the first
byte and that of the second byte are respectively decoded by a first
decoder and a second decoder. A first selector selects one decoded
result from among a plurality of decoded results delivered from the
first decoder. A second selector selects one decoded result from among
a plurality of decoded results delivered from the second decoder, in
accordance with the decode information delivered from the first
selector. The select operation of the first selector is determined in
accordance with the decode information delivered from the second
selector. According to the microprocessor thus constructed, the machine
language instructions of 2 bytes can be decoded in one machine cycle,
and the decode time period of the variable-length instruction can be
shortened.

[0006] The US-A-3 566 366 discloses a stored program computer in which
both half word length and full word length instructions are employed.
A circuit arrangement selectively omits execution of no-operation half
word length instructions. The second instruction of a pair of half word
length instructions is decoded during execution of the first
instruction of the pair and an output signal is generated when the
second instruction is a no-operation instruction.

[0007] In the EP-A-0 354 740, a data processing apparatus for decoding
and executing instructions in a parallel manner in a variable word
length instruction format is disclosed. A plurality of decoders is used
in which, while the primary instruction decoder is decoding an
instruction, the probability of parallel decoding of the next
instruction is detected, so that the primary instruction decoder and a
secondary instruction decoder decode a variable word length instruction
and a fixed word length instruction, respectively, in a parallel
manner.

[0008] The prior art known from the EP-A-0 354 740 mentioned above
forms the basis for the preamble of attached claim 1.

SUMMARY OF THE INVENTION:

[0009] The inventors' study, however, has revealed that, with the prior
art technique disclosed in the aforementioned treatises, two problems
are involved in case of raising the speed of the processing of the
microprocessor still more.

[0010] The first problem is that, since the instruction decode uses two
stages in the pipeline, branch processing slows down to the
corresponding extent.

[0011] That is, in a case where the branch processing is started and is
followed by fetching and pre-decoding a branch destination instruction,
the time period expended on the branch is longer by one stage than in a
microprocessor which requires only one stage for decode processing.

[0012] As the second problem, in a case where the method of decoding an
instruction in two stages as in the prior-art technique is adopted in a
microprocessor which executes a plurality of instructions in parallel,
the pre-decode unit governs the performance of the whole
microprocessor. The reason is that, since the instruction to be
processed is in the state of a variable length, the succeeding
instruction cannot be pre-decoded unless the pre-decode unit
pre-decodes the preceding instruction. That is, the pre-decode unit can
pre-decode only one instruction at a time.

[0013] It has also been revealed by the inventors' study that there are
three solving methods for the second problem.

[0014] The first solution is that a plurality of pre-decode circuits
for pre-decoding a plurality of instructions are connected in series.
Herein, the succeeding pre-decode circuit refers to the output of the
preceding pre-decode circuit. Moreover, the plurality of pre-decode
circuits are designed so as to be operable within one cycle. Then, the
problem can be solved. In this case, however, the delay time of the
pre-decode circuits connected in series becomes problematic.
[0015] The second solution is the method in which the pre-decode unit
is endowed with a performance capable of pre-decoding one instruction
in one cycle, whereupon the difference between the processing
performances of the pre-decoder and the instruction decoder is absorbed
by a buffer arranged between them. Since, however, the maximum
throughput becomes one instruction/cycle with this method, the
performance of the microprocessor is not considerably enhanced in spite
of the fact that the microprocessor is specially permitted to execute
the plurality of instructions at the other stage.
[0016] The third solution is that, as in the present invention, the
succeeding instruction is decoded by placing an assumption on the
format of the preceding instruction.

[0017] The present invention has been made in practicably realizing the
third solution, and has for its object to provide a microprocessor
which can decode a plurality of instructions at high speed and in
parallel in case of processing a variable-length instruction set.
[0018] Meanwhile, regarding the prior-art technique disclosed in the
aforementioned official gazette of Japanese Patent Application
Laid-open, the inventors' study has revealed the following
disadvantage: In order to decode all the patterns of the unit
instructions, a large number of instructions need to be decoded in
parallel in the first and second instruction decoders, and one decoded
result needs to be selected from among a large number of decoded
results by the selectors. Therefore, the hardware quantities of the
instruction decoders and the selectors become enormous.
[0019] It is accordingly another object of the present invention to
provide a microprocessor which can decode a plurality of instructions
at high speed and in parallel while restraining the quantity of its
hardware to the minimum.

[0020] In order to accomplish the objects, according 
to the present invention, an instruction is decoded under 
the assumption of the instruction length thereof. 
[0021] Subsequently, when the assumption has been found correct by the
decode of the instruction, the decoded result of a succeeding
instruction is also judged correct. To the contrary, when the
assumption has been found erroneous, the decoded result of the
succeeding instruction is judged erroneous, and it is invalidated.
[0022] Further, the assumptive instruction length should desirably be
the length of the shortest instruction format in an instruction set.
The reason is that the instruction format which is the shortest in the
variable-length instruction set corresponds to instructions of high
frequency of use, so the assumption holds good at a high probability.
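
The assumption-and-invalidate rule of paragraphs [0020] to [0022] can be sketched as follows. This is a minimal illustration rather than the circuit of the invention: the function name `speculative_decode` and the use of `None` for an invalidated result are hypothetical, and the 16-bit shortest format is taken from the embodiment described later.

```python
SHORTEST_BITS = 16  # shortest instruction format of the embodiment


def speculative_decode(first_length_bits, second_result):
    """Keep or discard the speculatively decoded second instruction.

    The second instruction was decoded on the assumption that the first
    instruction has the shortest format. The assumption is checked only
    after the first instruction has itself been decoded.
    """
    if first_length_bits == SHORTEST_BITS:
        return second_result  # assumption correct: result is valid
    return None  # assumption erroneous: result is invalidated


# A 16-bit first instruction keeps the second result; a 32-bit one discards it.
kept = speculative_decode(16, "add")
dropped = speculative_decode(32, "add")
```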

[0023] Besides, in order to decode a plurality of instructions in
parallel, an instruction prefetch unit transfers an instruction code
whose length is at least double the shortest instruction format, to an
instruction decode unit.

[0024] In the instruction decode unit, the instruction code is input to
individual instruction decoders in units of the length of the shortest
instruction format. Each of the instruction decoders is capable of
decoding, at least, the instructions having the shortest instruction
format, and at least one of the instruction decoders is capable of
decoding all the instructions of the instruction set. It is also
possible to hold the outputs of the respective instruction decoders in
output latches different from one another.
[0025] A microprocessor according to a typical embodiment of the
present invention is outlined as follows:
[0026] The microprocessor is characterized by comprising:

(1) a fetch unit (IU) which fetches first and second instructions each
having an instruction length of predetermined bit width (16 bits), from
outside said microprocessor, and which delivers the first and second
instructions to output lines in parallel, said output lines having a
bit width (32 bits) that is at least double the predetermined width;

(2) a first instruction decoder (ID0) whose input is supplied with the
first instruction on said output lines of said fetch unit (IU);

(3) a second instruction decoder (ID1) whose input is supplied with the
second instruction on said output lines of said fetch unit (IU);

(4) a control unit (PCNT) which is supplied with a decoded result
(id0_out) of said first instruction decoder and that (id1_out) of said
second instruction decoder; and

(5) an instruction execution unit (EU) which responds to an output from
said control unit (PCNT);

wherein under a condition under which the first instruction of the
predetermined instruction length is delivered from said output lines
having the bit width that is at least double the predetermined width,
said control unit (PCNT) responds to information on fulfillment of the
condition in the decoded result (id0_out) of said first instruction
decoder (ID0) and validates the decoded result (id1_out) of said second
instruction decoder (ID1), so that said instruction execution unit (EU)
executes the first instruction and the second instruction in parallel
in response to the decoded results (id0_out, id1_out) of said first and
second instruction decoders transmitted as the output of said control
unit;

whereas under another condition under which an instruction having an
instruction length different from the predetermined bit width is
delivered from said output lines of said fetch unit (IU), said control
unit (PCNT) responds to information on fulfillment of the other
condition in the decoded result (id0_out) of said first decoder (ID0)
and invalidates the decoded result (id1_out) of said second decoder
(ID1), so that said instruction execution unit (EU) executes the first
instruction in response to the decoded result (id0_out) of said first
instruction decoder (ID0) transmitted as the output of said control
unit (PCNT).
[0027] It is decided whether or not the instruction codes processed by
the instruction decoders correspond to the instructions which can be
decoded by the respective instruction decoders (that is, the
instructions which have the shortest instruction format). In a case
where, as the result of the decision, any of the instruction decoders
has decoded the instruction having any different instruction format,
the decoded results of the instruction codes succeeding the particular
instruction are all invalidated. The invalidation is readily realized
using a control circuit. To the contrary, in a case where, as the
result of the decision, all the instruction decoders have decoded the
instructions having the shortest instruction format, all the decoded
results are valid. On this occasion, the throughput of the instruction
decode is the maximum, and the instructions equal in number to the
instruction decoders are processed in one cycle.
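
The decision of paragraph [0027] generalizes to any number of decoders, and a minimal sketch of it is given here. The function name `valid_mask` is hypothetical. The sketch also assumes that the decoder which meets the longer instruction is able to decode it, so that only the results succeeding that instruction are dropped.

```python
def valid_mask(is_shortest):
    """Return one validity flag per instruction decoder.

    is_shortest[k] is True when decoder k found an instruction in the
    shortest format. The first decoder that finds any other format
    invalidates every decoded result after its own.
    """
    mask = []
    still_valid = True
    for flag in is_shortest:
        mask.append(still_valid)
        # Once a longer instruction is seen, later results are invalid.
        still_valid = still_valid and flag
    return mask


# All shortest: every result is valid, so throughput is at its maximum.
all_valid = valid_mask([True, True, True])
# Decoder 1 met a longer instruction: the result of decoder 2 is dropped.
partly_valid = valid_mask([True, False, True])
```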
[0028] Thus, the maximum throughput of the instruction decode can be
rendered two or more instructions/cycle though subject to the cases of
the correct assumption, and the second problem stated before can be
solved. Moreover, since the instruction length is assumed, the
variable-length instruction need not be decomposed into the
fixed-length elements by the pre-decode circuit, and the first problem
stated before can be solved.

[0029] In addition, according to the present invention, the second
instruction decoder executes significant decode concerning only the
instruction head code of the instruction (in other words, one sort of
decode), and the insignificant decoded result of the second instruction
decoder is invalidated under any other condition (in other words, in
case of a non-head code). Therefore, the plurality of instructions can
be decoded at high speed and in parallel while the hardware quantity of
the second instruction decoder is restrained to the minimum.
[0030] Unlike the pre-decoding method hitherto known, the instruction
decoding method of the present invention decodes an instruction under
an erroneous assumption in a certain case. In this case, the decoded
result is invalidated as described above, and the throughput becomes
one instruction/cycle. In this manner, the processing performance
depends upon the instruction format more in the method of the present
invention than in the pre-decoding method. This point can be coped with
in such a way that the instructions which have the format fulfilling
the assumption are used to the utmost in a program.

[0031] Other objects and features of the present invention will become
apparent from the ensuing description of embodiments taken in
conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS:
[0032] 

Fig. 1 shows a block diagram of a microprocessor which is an
embodiment of the present invention;
Fig. 2 shows the six sorts of instruction lengths of a variable-length
instruction set which the microprocessor of the embodiment has;
Fig. 3 shows an example of the row of instructions in a memory as to
the instruction set of the embodiment;
Figs. 4(A) and 4(B) show the values of signal lines i0 - i5 in the case
where the microprocessor shown in Fig. 1 executes the instruction row
in Fig. 3, as to two certain points of time;
Fig. 5 shows a detailed arrangement diagram of a control circuit PCNT
which is one of the constituents of the microprocessor in Fig. 1; and
Fig. 6(A) shows the changes of control signals which are generated by
instruction decode in the case where the instruction row in Fig. 3 is
executed by the microprocessor in Fig. 1, while Fig. 6(B) shows the
changes of control signals in the case of employing an architecture in
which the microprocessor in Fig. 1 includes only one instruction
decoder.

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS: 

[0033] Fig. 1 is a block diagram of a microprocessor to which the
present invention is applied. The present invention makes it possible
to decode a plurality of instructions in parallel. Here will be
described the internal architecture and operation of the microprocessor
which decodes two instructions in parallel as the simplest aspect of
the parallel decode of the plurality of instructions.

Internal Architecture of Microprocessor

[0034] First, the internal architecture of the microprocessor will be
described with reference to Fig. 1. The microprocessor in Fig. 1 is
basically constructed of an interface unit IOU, an instruction prefetch
unit IU, an instruction decode unit DU and an execution unit EU. These
units are capable of parallel operations, and pipeline processing is
performed under the control of the instruction decode unit DU.

Interface Unit IOU

[0035] The microprocessor is connected with external devices (for
example, a main memory) through the interface unit IOU. This interface
unit IOU transfers both instructions and data between the
microprocessor and the main memory.

[0036] More specifically, an instruction fetched from the main memory
is transferred from the interface unit IOU to the instruction prefetch
unit IU through signal lines having a width of 64 bits.
[0037] On the other hand, data computed by the execution unit EU is
transferred from this execution unit EU to the interface unit IOU
through signal lines in two sets each consisting of 32 bits, while data
fetched from the main memory is transferred from the interface unit IOU
to the instruction decode unit DU.

Instruction Prefetch Unit IU

[0038] The instruction prefetch unit IU has a prefetch queue PFQ. The
instructions transferred from the interface unit IOU are once latched
in the prefetch queue PFQ and aligned in 16-bit units, whereupon the
aligned instructions are delivered to the instruction decode unit DU.
The prefetch queue PFQ is a FIFO (First-In First-Out) queue.

[0039] The instructions after the alignment are transferred from the
instruction prefetch unit IU to the instruction decode unit DU through
six sets of 16-bit signal lines i0 - i5. Here, the signal line i0 bears
the head code of the instruction to be decoded in the next machine
cycle, and the signal lines i1 - i5 bear the row of the instructions
succeeding the instruction of the signal line i0. The signal line i0 is
connected to a first instruction decoder ID0. Similarly, the signal
line i1 is connected to a second instruction decoder ID1. It is the
feature of the embodiment of the present invention that the input of
the second instruction decoder ID1 is uniquely determined by the signal
of the signal line i1 and is not selected from among the signals of the
signal lines i1 - i5. Besides, the first instruction decoder ID0 has
the function of decoding all instructions which can be processed by the
microprocessor. In contrast, the second instruction decoder ID1 can
decode only instructions in an instruction format having a length of 16
bits or 32 bits, among the instructions which the microprocessor can
execute. The decoded results of the instructions in the first
instruction decoder ID0 and the second instruction decoder ID1 are
respectively delivered to signal lines id0_out and id1_out and then
sent to a pipeline control unit PCNT.

Pipeline Control Unit PCNT

[0040] The pipeline control unit PCNT generates control signals for the
units IOU, IU and EU on the basis of the signals of the signal lines
id0_out and id1_out and signals (not shown in the figure) indicating
the statuses of these units IOU, IU and EU.

Expansion Part Generator EG 

[0041] In addition, the instruction decode unit DU includes an
expansion part generator EG, by which immediate data or displacement
data in the instructions is expanded to 32 bits and then delivered. The
position and length of the immediate data or displacement data in any
instruction are designated in the operation code of the instruction,
and the data is obtained by decoding the operation code. The expansion
part generator EG processes the data on the basis of the designation,
and delivers the processed data to a bus d0 or d1. The reason why the
expansion part generator EG has two sets of 32-bit output lines is that
the data items are transferred independently under the respective
controls of the first instruction decoder ID0 and the second
instruction decoder ID1.
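
The expansion of immediate or displacement data to 32 bits might be sketched as below. The text does not state whether the expansion part generator EG extends the sign or pads with zeros, so the `signed` parameter, like the function name `expand_to_32`, is an assumption made only for illustration.

```python
def expand_to_32(value, width, signed=True):
    """Expand a `width`-bit field to a 32-bit value.

    signed=True models sign extension and signed=False models zero
    extension. Which of the two the expansion part generator EG applies
    is not stated in the description, so both are offered here.
    """
    value &= (1 << width) - 1  # keep only the field bits
    if signed and value & (1 << (width - 1)):
        value -= 1 << width  # reinterpret the field as a negative number
    return value & 0xFFFFFFFF  # deliver as a 32-bit pattern


# A 16-bit field holding -2 becomes the 32-bit pattern for -2.
extended = expand_to_32(0xFFFE, 16)
```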

Execution Unit EU 

[0042] Two integral arithmetic logic units ALU are similarly disposed
in the execution unit EU so as to correspond to the first instruction
decoder ID0 and the second instruction decoder ID1, respectively.



[0043] A register file RF in the instruction decode unit DU is
configured of sixteen 32-bit registers R0 thru R15. Each of the
registers has four read ports and two write ports, totaling six ports.
Among these ports, one half (two read ports and one write port)
corresponds to the side of the first instruction decoder ID0 and is
connected to the first arithmetic logic unit ALU0. Likewise, the ports
of the other half correspond to the side of the second instruction
decoder ID1 and are connected to the second arithmetic logic unit ALU1.

Signal Lines of 32-bit Width

[0044] The instruction decode unit DU and the execution unit EU are
connected by six sets of signal lines d0, d1, d2, d3, e0 and e1 each
having a width of 32 bits. Among them, the four sets (d0, d1, d2, d3)
are used for transferring data from the instruction decode unit DU to
the execution unit EU, while the remaining two sets (e0, e1) are used
for transferring data from the execution unit EU to the instruction
decode unit DU.
[0045] By way of example, let's consider a case where the first
arithmetic logic unit ALU0 processes the instruction of adding the
values of the registers R0 and R1 and then setting the sum in the
register R1. In this case, the values of the registers R0 and R1 are
first read out from the register file RF and respectively delivered to
the 32-bit signal lines d0 and d1. At the execution stage which follows
the instruction decode unit DU in the pipeline, that is, in the
execution unit EU, the first arithmetic logic unit ALU0 receives the
values from the signal lines d0 and d1 and adds them up. The result of
the addition is delivered to the signal lines e0. Further, at the stage
of register store which is the next processing, the processing proceeds
in the instruction decode unit DU again, and the value on the signal
lines e0 is set in the register R1 within the register file RF. The
above is an operation employing the side of the first instruction
decoder ID0. In case of employing the side of the second instruction
decoder ID1, the signal lines d2, d3 and e1 and the second arithmetic
logic unit ALU1 are used. More specifically, the values of the
registers R0 and R1 are respectively delivered to the signal lines d2
and d3, and they are added by the second arithmetic logic unit ALU1.
Thereafter, the result of the addition is transferred to the register
R1 by the use of the signal lines e1.
[0046] In the case of transferring data between the microprocessor and
the memory, signal lines in two sets each consisting of 32 bits as laid
between the signal lines e0, e1 and the interface unit IOU are used.
Since the operation of this part is not directly relevant to the
present invention, it shall be omitted from description.
[0047] The effect of the present invention is that the parallel decode
of a plurality of instructions becomes possible. In this embodiment,
the microprocessor having the instruction set of variable-length
instructions will be taken as an example. Therefore, what the
variable-length instruction is will be first explained briefly.

Variable-length Instruction 

[0048] In short, the "variable-length instruction" means an instruction
which has a plurality of instruction formats and whose length changes
when the different instruction formats are taken. In other words, an
instruction set including any instruction of different length has the
instruction of variable length.

Fixed-length Instruction 

[0049] In contrast, a case where all instructions have 
a fixed length is generally called an "instruction set of 
fixed length". 

Instruction Set of This Embodiment 

[0050] As shown in Fig. 2, this embodiment assumes the set of
instructions which have six sorts of lengths of 16 bits thru 96 bits in
16-bit units. In the memory, the instructions are located on 16-bit
boundaries. That is, the 16-bit elements of the instructions are all
located at addresses of even-numbered bytes. This situation is
illustrated in Fig. 3.
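
The six lengths and the even-byte alignment described in paragraph [0050] can be written out as a short sketch. The names `LENGTHS_BITS` and `element_addresses` are hypothetical, and byte addressing of the memory is assumed.

```python
UNIT_BITS = 16  # the instruction set grows in 16-bit units

# The six instruction lengths of the embodiment: 16, 32, ..., 96 bits.
LENGTHS_BITS = [UNIT_BITS * n for n in range(1, 7)]


def element_addresses(start_byte, length_bits):
    """Byte addresses of the 16-bit elements of one instruction."""
    if length_bits not in LENGTHS_BITS:
        raise ValueError("length must be 16 to 96 bits in 16-bit units")
    if start_byte % 2 != 0:
        raise ValueError("instructions start at even-numbered bytes")
    return [start_byte + 2 * k for k in range(length_bits // UNIT_BITS)]


# A 48-bit instruction at byte 4 occupies the elements at bytes 4, 6 and 8.
addresses = element_addresses(4, 48)
```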

[0051] Next, the operation of the parallel decode of 
instructions in this embodiment will be described. 
[0052] Fig. 3 shows one example of the row of instructions in the
memory. The individual instructions are indicated as, for example,
inst0 and inst1. The instruction whose length exceeds 16 bits is
indicated as, for example, inst2_0 and inst2_1 by further affixing
underscores and numerals. That is, the instruction longer than 16 bits
is divided into a plurality of elements. It is also assumed that a code
which must be subjected to decode processing in each instruction is
limited to the head code of the instruction. In other words, it is
assumed that the non-head code of each instruction is immediate data or
displacement data. In the case of the instruction inst2 by way of
example, the first code inst2_0 needs to be decoded, but the succeeding
code inst2_1 need not be decoded.

[0053] Under the above premises, Figs. 4(A) and 4(B) show the statuses
of the 16-bit signal lines i0 - i5 at two certain points of time, the
signal lines constituting the transfer bus from the instruction
prefetch unit IU to the instruction decode unit DU. Fig. 4(A)
illustrates the statuses in which the instruction row in Fig. 3 has
already been accepted in the prefetch queue PFQ of the instruction
prefetch unit IU, and in which the first instruction inst0 is about to
be decoded. In the first half of the next machine cycle, the first
instruction inst0 is decoded by the first instruction decoder ID0, and
the succeeding instruction inst1 by the second instruction decoder ID1.
As the results of the decoding, it is found that the two instructions
inst0 and inst1 are both in the instruction format having the shortest
length. Then, a command is issued from the instruction decode unit DU
to the instruction prefetch unit IU so as to advance the pointer of
instructions by 32 bits. In consequence, after a further half machine
cycle, the signal lines i0 - i5 between the instruction prefetch unit
IU and the instruction decode unit DU fall into the statuses shown in
Fig. 4(B) in which the two instructions inst0 and inst1 have been taken
away and in which the instructions inst5 and inst6 are added instead.
On this occasion, the instruction code inst2_0 is decoded by the first
decoder ID0, and the instruction code inst2_1 by the second decoder
ID1. As the decoded result of the instruction code inst2_0 in the first
decoder ID0, it is found that the instruction inst2 is not in the
instruction format having the shortest length.
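
The movement of the signal lines i0 - i5 between Fig. 4(A) and Fig. 4(B) can be illustrated with a sliding window over the instruction row. The contents of `ROW` are an assumption: the figures are not reproduced here, so inst3 thru inst6 are taken to be 16-bit instructions, and the names `ROW` and `window` are hypothetical.

```python
# Assumed 16-bit elements of the instruction row of Fig. 3, in memory order.
ROW = ["inst0", "inst1", "inst2_0", "inst2_1",
       "inst3", "inst4", "inst5", "inst6"]


def window(row, pointer, lines=6):
    """Elements presented on the signal lines i0 - i5 at a given pointer."""
    return row[pointer:pointer + lines]


# Fig. 4(A): inst0 sits on i0 and inst1 on i1.
status_a = window(ROW, 0)

# inst0 and inst1 are both 16 bits, so the pointer advances by 32 bits,
# that is, by two 16-bit elements.
status_b = window(ROW, 2)
```

In `status_b` the element inst2_1 sits on i1, which is why the second decoder ID1 receives a non-head code in that cycle.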

[0054] In a case where the shortest instruction is input to the first
instruction decoder ID0, the head operation code of the next
instruction is input to the second instruction decoder ID1. The second
instruction decoder ID1 decodes the instruction, assuming such input of
the head operation code of the next instruction. Therefore, in a case
where the instruction decoded in the first instruction decoder ID0 is a
non-shortest instruction, the instruction decode in the second
instruction decoder ID1 is judged erroneous. The judged result of the
error is reflected in the output id0_out of the first instruction
decoder ID0, and invalidation processing is performed in the pipeline
control unit PCNT in response to this judged result. As shown in
Fig. 1, the decoded results, namely, the outputs id0_out and id1_out
are sent from the first and second instruction decoders ID0, ID1 to the
pipeline control unit PCNT. The output id0_out contains information
indicating whether or not the instruction decoded in the first decoder
ID0 is in the instruction format of the shortest length. On the other
hand, the output id1_out may well contain information indicating that
the instruction which the second decoder ID1 cannot decode has been
input. In this embodiment, however, it is supposed that such
information is not contained in the output id1_out.

[0055] The decoded result of the second instruction decoder ID1 must be
invalidated in conformity with the information which is contained in
the output id0_out and which indicates that the length of the
instruction having been input to the first instruction decoder ID0 is
not the shortest, namely 16 bits. The processing for the invalidation
is performed by the pipeline control unit PCNT as stated above.

Detailed Block Diagram of Pipeline Control Unit PCNT 

[0056] Fig. 5 shows a detailed block diagram of the pipeline control
unit PCNT.

[0057] The pipeline control unit PCNT is configured of a pipeline stage
control unit Pipe_CNTL, a selector SEL and a no-operation command unit
NOP, and it controls the pipeline operation of the whole microprocessor
on the basis of the outputs id0_out, id1_out and the statuses of the
respective units (IU, DU, EU, IOU). The processing stages in the
pipeline processing are controlled by the pipeline stage control unit
Pipe_CNTL of the pipeline control unit PCNT in Fig. 5. Besides, the
invalidation processing for the output information of the second
instruction decoder ID1 is performed on this side of the pipeline stage
control unit Pipe_CNTL.
[0058] More specifically, the output id1_out of the second
instruction decoder ID1 is invalidated as follows: This
output id1_out of the second instruction decoder ID1 is
supplied to one input of the selector SEL. In this
embodiment, though not especially restricted, another input
of the selector SEL is supplied with a fixed value NOP. The
fixed value NOP has exactly the same fields as those of the
output id1_out, and provides a non-execution command
instruction called "no operation". The value NOP may be
either identical to or different from the decoded
information of an "nop" instruction which is generally
employed as the instruction for commanding no operation.
What is necessary is that the instruction NOP commands no
operation; the size of data to be handled, for example, may
be designated to any value. The selection of either the
value NOP or the output id1_out in the selector SEL is done
in accordance with the information id1_valid, which is
contained in the output id0_out being the decoded result of
the first instruction decoder ID0 and which indicates
whether or not the full length of the instruction decoded
by the first instruction decoder ID0 is 16 bits. In a case
where the instruction length is 16 bits, the output id1_out
is selected. On the contrary, in a case where the
instruction length exceeds 16 bits, the value NOP is
selected. In this way, pipeline control signals pcnt0 and
pcnt1 are obtained.
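
The selection performed by the selector SEL can be illustrated
with a minimal Python sketch. It is only a behavioural model
under stated assumptions: the record type Decoded, its field op
and the function name pcnt are hypothetical and are not taken
from the specification; only the signal names id0_out, id1_out,
id1_valid, pcnt0, pcnt1 and NOP come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Decoded:
    """Hypothetical decoded-result record; the field layout is illustrative."""
    op: str                  # operation commanded to the downstream units
    id1_valid: bool = False  # meaningful only inside id0_out


# Fixed value on the other selector input; it commands no operation.
NOP = Decoded(op="nop")


def pcnt(id0_out: Decoded, id1_out: Decoded) -> Tuple[Decoded, Decoded]:
    """Return (pcnt0, pcnt1) as produced by the pipeline control unit."""
    # id1_valid is asserted when the instruction seen by ID0 is 16 bits long.
    pcnt1 = id1_out if id0_out.id1_valid else NOP
    return id0_out, pcnt1
```

When id1_valid is negated, whatever the second decoder produced
is discarded, so the second decoder may decode the second 16-bit
element speculatively without any risk of executing it.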
[0059] Let us suppose the execution of the instruction
row in Fig. 3 again. The changes of the pipeline control
signals pcnt0 and pcnt1 on this occasion are shown in
Fig. 6(A). It should be noted that, unlike Fig. 3, Fig. 6(A)
represents time in the vertical direction thereof. By way
of example, when the statuses in Fig. 4(A) shift into the
statuses in Fig. 4(B), the instructions inst0 and inst1 are
decoded. This situation is indicated at the uppermost
line in Fig. 6(A). In the next machine cycle, the
instruction codes inst2_0 and inst2_1 are decoded, and the
decoded result of the instruction code inst2_0 and the fixed
value NOP are respectively delivered as the signals
pcnt0 and pcnt1. Thenceforth, the execution proceeds
similarly, and the instructions inst0 through inst6 are
subjected to the decode processing in 4 machine cycles.

[0060] Shown in Fig. 6(B) are the changes of the control
signal pcnt0 in the case of the prior art where
processing similar to the above is performed using only
the first instruction decoder ID0. In this case of the prior
art, 7 machine cycles are required for the decode
processing of the instructions inst0 through inst6, as
illustrated in Fig. 6(B).

[0061] Thus, in this embodiment, an instruction decoding
capability twice as high is attained at the peak, and a
capability equal to that attained with the single
instruction decoder is attained even in the worst case.
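
The cycle counts quoted above can be reproduced with a short
Python sketch. It assumes, as an inference from the 4-cycle and
7-cycle figures rather than from Fig. 3 itself, that every
instruction except inst2 is 16 bits long and that inst2 is 32
bits long. The function name decode_cycles and the pairing rule
(two instructions are decoded together only when both are of
the shortest length) are illustrative simplifications.

```python
def decode_cycles(lengths, dual=True, shortest=16):
    """Count machine cycles needed to decode instructions of the given bit lengths."""
    cycles = 0
    i = 0
    while i < len(lengths):
        cycles += 1
        # Two instructions are consumed in one cycle only if both decoders
        # see a shortest-format instruction (id1_valid asserted).
        pair = (dual
                and lengths[i] == shortest
                and i + 1 < len(lengths)
                and lengths[i + 1] == shortest)
        i += 2 if pair else 1
    return cycles


# Assumed lengths for inst0..inst6: only inst2 (inst2_0 + inst2_1) is 32 bits.
row = [16, 16, 32, 16, 16, 16, 16]
print(decode_cycles(row, dual=True))   # 4
print(decode_cycles(row, dual=False))  # 7
```

Under the same model, a row consisting only of 16-bit
instructions is decoded in half the cycles (the peak case),
while a row consisting only of longer instructions takes as
many cycles as with the single decoder (the worst case).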
[0062] Now, the processing of the instructions inst0,
inst1 and inst2_0, inst2_1 will be described using more
concrete examples. As the examples, it is assumed
that the instruction inst0 is the fixed-length instruction of
adding the values of the registers R0 and R1 and then
setting the result in the register R1, that the instruction
inst1 is the fixed-length instruction of adding the values
of the registers R2 and R3 and then setting the result in
the register R3, and that the instruction inst2 is the
variable-length instruction of adding displacement data to
the value of the register R4 to obtain an address and
then fetching the data of the address from the main
memory and setting it in the register R5. Here, the
instruction code inst2_0 is the operation code, and the
instruction code inst2_1 is the displacement data.

[0063] First, the processing of the instructions inst0
and inst1 will be described.

[0064] The two instructions inst0 and inst1 are delivered
to the 96-bit signal lines laid from the prefetch
queue PFQ, in the manner shown in Fig. 4(A). Then, the
instruction inst0 is decoded by the first instruction
decoder ID0, while the instruction inst1 is decoded by the
second instruction decoder ID1.
[0065] In this case, it is decided as the result of the
decoding of the instruction inst0 that this instruction
inst0 is the shortest instruction. The result of the
decision is indicated by asserting the signal id1_valid in the
decoded result id0_out. The outputs id0_out and
id1_out are respectively delivered as the control signals
pcnt0 and pcnt1 through the pipeline control unit PCNT
described before. Subsequently, the operation stated
below is performed by the commands of these control
signals.

[0066] The value of the register R0 and that of the
register R1 are respectively delivered to the signal lines d0
and d1 in accordance with the command of the control
signal pcnt0. Simultaneously, the value of the register
R2 and that of the register R3 are respectively delivered
to the signal lines d2 and d3 in accordance with the
command of the control signal pcnt1. Subsequently, the
arithmetic logic unit ALU0 adds the values of the signal
lines d0 and d1 and delivers the sum to the signal lines
e0, while the arithmetic logic unit ALU1 adds the values
of the signal lines d2 and d3 and delivers the sum to the
signal lines e1. Further, at the succeeding stage of
register store, the value of the signal lines e0 is set in the
register R1, and the value of the signal lines e1 in the
register R3.
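
The parallel execution of inst0 and inst1 described above can
be traced with the following Python sketch. It is a behavioural
illustration only: the function name execute_pair, the
dictionary used as a register file and the 32-bit wrap-around
of the sums are assumptions; the signal names d0 to d3, e0 and
e1 follow the text.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data path


def execute_pair(regs):
    """Execute 'R1 <- R0 + R1' (pcnt0) and 'R3 <- R2 + R3' (pcnt1) together."""
    # Register read stage: pcnt0 drives d0/d1, pcnt1 drives d2/d3.
    d0, d1 = regs["R0"], regs["R1"]
    d2, d3 = regs["R2"], regs["R3"]
    # Execution stage: ALU0 and ALU1 operate in the same machine cycle.
    e0 = (d0 + d1) & MASK32
    e1 = (d2 + d3) & MASK32
    # Register store stage.
    regs["R1"] = e0
    regs["R3"] = e1
    return regs


regs = execute_pair({"R0": 1, "R1": 2, "R2": 10, "R3": 20})
print(regs["R1"], regs["R3"])  # 3 30
```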

[0067] Next, the processing operation of the instruc- 
tion inst2 will be described. 

[0068] The instruction inst2 is delivered to the 96-bit
signal lines laid from the prefetch queue PFQ, in the
manner shown in Fig. 4(B). Then, the instruction code
inst2_0 is decoded in the first instruction decoder ID0,
while the instruction code inst2_1 is decoded in the
second instruction decoder ID1 under the assumption that
it is the head code of the next instruction.
[0069] In this case, it is decided as the result of the
decoding of the instruction code inst2_0 that the
instruction inst2 is a non-shortest instruction. The result of the
decision is indicated by negating the signal id1_valid in
the decoded result id0_out. The output id0_out is
delivered as the control signal pcnt0 through the pipeline
control unit PCNT described before. Simultaneously,
since the signal id1_valid is negated, the instruction
NOP commanding no operation is selected by the
selector SEL in the pipeline control unit PCNT and is
delivered as the control signal pcnt1. Subsequently, the
operation stated below is performed by the
commands of these control signals.
[0070] The value of the register R4 is delivered to the
signal lines d0 in accordance with the command of the
control signal pcnt0. Also, the displacement data
inst2_1 of 16 bits is expanded into 32 bits by the
expansion part generator EG, and the expanded data is
delivered to the signal lines d1.

[0071] Besides, since the command of the control signal
pcnt1 is the value NOP, no output in particular is
delivered to the signal lines d2 and d3. Subsequently,
the integral arithmetic logic unit ALU0 adds the values
of the signal lines d0 and d1 (for calculating the address)
and delivers the sum to the signal lines e0. The
command for the arithmetic logic unit ALU1 is also the value
NOP, and no output in particular is delivered to the
signal lines e1.

[0072] Further, at the succeeding stage, in accordance
with the command of the control signal pcnt0, that
address of the main memory which is specified by the
value of the signal lines e0 is accessed to fetch an
operand, and the fetched data is set in the register R5.
Since the commands of the control signal pcnt1 for the
interface unit IOU and the instruction decode unit DU
(register store) are the value NOP, the main memory is
not accessed on behalf of pcnt1, nor is any value
transferred from the signal lines e1 to any register.
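
The handling of inst2 can likewise be traced in Python. The
sketch below is illustrative and rests on assumptions that the
text does not state: the expansion from 16 to 32 bits is
modelled as sign extension, the main memory is a dictionary
keyed by address, and the function names sign_extend16 and
execute_load are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit data path


def sign_extend16(value):
    """Expand a 16-bit value to a signed integer (sign extension is assumed)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_load(regs, memory, inst2_1):
    """Execute 'R5 <- memory[R4 + displacement]' under the command of pcnt0."""
    d0 = regs["R4"]                          # register read
    d1 = sign_extend16(inst2_1) & MASK32     # EG: 16 bits expanded into 32 bits
    e0 = (d0 + d1) & MASK32                  # ALU0: address calculation
    regs["R5"] = memory[e0]                  # operand fetch and register store
    # pcnt1 carries NOP: d2, d3 and e1 are not driven, and no second access occurs.
    return e0


regs = {"R4": 0x1000, "R5": 0}
memory = {0x1008: 0xABCD, 0x0FFC: 7}
print(hex(execute_load(regs, memory, 0x0008)), hex(regs["R5"]))  # 0x1008 0xabcd
```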
[0073] According to this embodiment, the throughput
of the processing of the whole microprocessor is
enhanced, and CPI (the number of machine cycles
required for executing one instruction) can be rendered
less than one.

[0074] Moreover, the plurality of instruction decoders
need include only one instruction decoder capable of
decoding all the instruction formats. The remaining
instruction decoders may have merely the function of decoding
the shortest instruction format. Therefore, the decoding
of a plurality of instructions can be realized with a small
quantity of hardware. This merit results also in reducing
the quantities of processing for testing and diagnosing
the microprocessor and in shortening the time periods
of the processing.

[0075] Besides, an instruction code to be input to the
plurality of instruction decoders is uniquely divided by
the length of the shortest instruction format, and the
resulting elements are input to the respective instruction
decoders. That is, the inputs of all the instruction
decoders are selected with ease. This merit is useful for the
realization of a high speed together with the suppression
of the quantity of hardware.
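
The division of the fetched code by the shortest instruction
length can be sketched in Python as follows. The function name
split_fetch_word is hypothetical, and placing the first element
at the most significant end of the 96-bit word is an assumption
made only for this illustration.

```python
def split_fetch_word(word, total_bits=96, shortest=16):
    """Divide a fetched word into elements of the shortest instruction length."""
    count = total_bits // shortest
    mask = (1 << shortest) - 1
    # Element k is taken from fixed bit positions, so no length-dependent
    # alignment logic is needed in front of the decoders.
    return [(word >> (total_bits - shortest * (k + 1))) & mask
            for k in range(count)]


elements = split_fetch_word(0x000100020003000400050006)
print([hex(e) for e in elements])
# ['0x1', '0x2', '0x3', '0x4', '0x5', '0x6']
```

Because each decoder input is wired to a fixed slice of the
word, the first element always goes to the first decoder and
the second element to the second decoder.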

[0076] The embodiment of the present invention is also
applicable to a microprocessor which has a fixed-length
instruction set. More specifically, most of the plurality
of instruction decoders are permitted to decode only
instructions of high frequency of use, whereby
instruction decoders for processing the plurality of
instructions in parallel, which have a small quantity of
hardware and which operate at high speed, can be realized.
[0077] In addition, irrespective of the fixed-length
instruction set and the variable-length instruction set,
the instructions which each instruction decoder is capable of
decoding can be determined in correspondence with the
circuit which the instruction decoder controls. By way of
example, an instruction decoder for controlling an
arithmetic logic unit is capable of decoding only arithmetic
logic instructions, and for any other instruction, it
produces a result indicating that it has failed to decode the
instruction. This measure brings forth the effect that the
number of signal lines to be laid from the instruction
decoder to the controlled circuit decreases.
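
A decoder restricted to the instructions of the circuit it
controls might behave as in the following Python sketch. The
opcode table, the position of the opcode in the upper 4 bits
and the function name alu_decoder are all hypothetical; the
specification does not define an encoding here.

```python
# Hypothetical opcode assignments for arithmetic logic instructions only.
ALU_OPCODES = {0x1: "add", 0x2: "sub", 0x3: "and", 0x4: "or"}


def alu_decoder(code16):
    """Decode a 16-bit code if it is an arithmetic logic instruction."""
    opcode = (code16 >> 12) & 0xF  # assumed: opcode held in the upper 4 bits
    op = ALU_OPCODES.get(opcode)
    if op is None:
        # Any other instruction yields a result indicating a failed decode.
        return {"decoded": False}
    return {"decoded": True, "op": op}


print(alu_decoder(0x1012))  # {'decoded': True, 'op': 'add'}
print(alu_decoder(0x9000))  # {'decoded': False}
```

Since such a decoder emits only the control fields relevant to
the arithmetic logic unit, fewer signal lines need to run from
it to the controlled circuit.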
[0078] The present invention makes it possible to decode
a plurality of fixed-length instructions in parallel in
a variable-length instruction set. As compared with the
prior-art method, accordingly, the invention enhances
the maximum throughput of an instruction decoding
performance.



1. A microprocessor comprising:

a fetch unit (IU) which fetches at least one
instruction from outside of said microprocessor,
and which delivers said at least one instruction
to output lines (i0-i5), said output lines (i0-i5)
having a bit width that is at least double a
predetermined bit width;

a first instruction decoder (ID0) whose input is
supplied with a first output which is delivered
from said fetch unit on one of said output lines
(i0);

a second instruction decoder (ID1) whose input
is supplied with a second output which is
delivered from said fetch unit on the other one of
said output lines (i1);

a control unit (PCNT) which is supplied with a
first decoded result (id0_out) of said first
instruction decoder (ID0) and a second decoded
result (id1_out) of said second instruction
decoder (ID1); and

an instruction execution unit (EU) which
responds to an output from said control unit
(PCNT);

characterized in that said first instruction
decoder decodes a predetermined set of instructions
executed in said instruction execution unit, and said
second instruction decoder decodes a part of said
predetermined set of instructions;

that said control unit (PCNT) includes a selector
(SEL), a first input and a second input of which
are respectively supplied with said second decoded
result (id1_out) of said second instruction decoder
(ID1) and a non-execution command instruction
(NOP), and a control input of which is supplied with
information (id1_valid) indicating whether or not
said first output is an instruction having an
instruction length of said predetermined bit width;

that, when each of said first and second outputs
is an instruction having an instruction length of
said predetermined bit width, said selector (SEL)
transmits said second decoded result in response
to said information, so that said instruction
execution unit (EU) executes said first output and said
second output in parallel in response to said first
decoded result and said second decoded result,
and

that, when said first output is a part of an
instruction having an instruction length longer than
said predetermined bit width, said selector (SEL)
transmits said non-execution command instruction
(NOP) in response to said information in order to
invalidate the second decoded result, so that said
instruction execution unit (EU) executes said first
output in response to said first decoded result.

2. A microprocessor according to claim 1, wherein,
when said control unit (PCNT) invalidates the
decoded result (id1_out) of said second instruction
decoder (ID1), said instruction execution unit (EU)
determines an address of an operand in response to
that bit information of said output lines of said fetch
unit (IU) which corresponds to a bit position of the
invalidated decoded result of said second
instruction decoder.

3. A microprocessor according to claim 1, wherein the
predetermined bit width is the shortest instruction
length.



Patentansprüche (patent claims)

1. Microprocessor comprising:

a fetch unit (IU) which fetches at least one
instruction from outside the microprocessor and
transfers the at least one instruction to output
lines (i0-i5) having a bit width that is at least
double a predetermined bit width;
a first instruction decoder (ID0) whose input is
supplied with a first output transferred from the
fetch unit onto one of the output lines (i0);
a second instruction decoder (ID1) whose input is
supplied with a second output transferred from the
fetch unit onto another of the output lines (i1);
a control unit (PCNT) which is supplied with a first
decoded result (id0_out) of the first instruction
decoder (ID0) and a second decoded result (id1_out)
of the second instruction decoder (ID1); and
an instruction execution unit (EU) which responds
to the output of the control unit (PCNT);

characterized in

that the first instruction decoder decodes a
predetermined instruction set to be executed in the
instruction execution unit, and the second instruction
decoder decodes a part of the predetermined
instruction set;

that the control unit (PCNT) comprises a selection
device (SEL), whose first and second inputs are
supplied respectively with the second decoded result
(id1_out) of the second instruction decoder (ID1)
and a no-operation command instruction (NOP), and
whose control input is supplied with information
(id1_valid) indicating whether or not the first output
has an instruction width of the predetermined bit
width;

that, when each of the first and second outputs is
an instruction having an instruction length of the
predetermined bit width, the selection device (SEL)
transmits the second decoded result in response to
said information, so that the instruction execution
unit (EU) executes the first and second outputs in
parallel in response to the first and second decoded
results, and

that, when the first output is part of an instruction
having an instruction length longer than the
predetermined bit width, the selection device (SEL)
transmits the no-operation command instruction
(NOP) in response to said information in order to
invalidate the second decoded result, so that the
instruction execution unit (EU) executes the first
output in response to the first decoded result.

2. Microprocessor according to claim 1, wherein, when
the control unit (PCNT) invalidates the decoded result
(id1_out) of the second instruction decoder (ID1),
the instruction execution unit (EU) determines an
operand address in response to that bit information
of the output lines of the fetch unit (IU) which
corresponds to a bit position of the invalidated
decoded result of the second instruction decoder.

3. Microprocessor according to claim 1, wherein the
predetermined bit width is the shortest instruction
length.

Revendications (claims)

1. Microprocessor comprising:

a read unit (IU) which reads at least one
instruction from outside said processor, and which
delivers said at least one instruction to output
lines (i0 to i5), said output lines (i0 to i5) having
a bit width that is at least double a predetermined
bit width,
a first instruction decoder (ID0) whose input is
supplied with a first output which is delivered by
said read unit on one of said output lines (i0),
a second instruction decoder (ID1) whose input is
supplied with a second output which is delivered by
said read unit on the other line among said output
lines (i1),
a control unit (PCNT) which is supplied with a first
decoded result (id0_out) of said first instruction
decoder (ID0) and a second decoded result (id1_out)
of said second instruction decoder (ID1), and
an instruction execution unit (EU) which responds
to an output from said control unit (PCNT),

characterized in that said first instruction decoder
decodes a predetermined set of instructions executed
in said instruction execution unit, and said second
instruction decoder decodes a part of said
predetermined set of instructions,

in that said control unit (PCNT) includes a selector
(SEL), a first input and a second input of which are
respectively supplied with said second decoded result
(id1_out) of said second instruction decoder (ID1) and
a no-effect command instruction (NOP), and a control
input of which is supplied with information (id1_valid)
indicating whether or not said first output is an
instruction having an instruction length equal to said
predetermined bit width,

in that, when each of said first and second outputs
is an instruction having an instruction length equal to
said predetermined bit width, said selector (SEL)
transmits said second decoded result in response to
said information, so that said instruction execution
unit (EU) executes said first output and said second
output in parallel in response to said first decoded
result and said second decoded result, and

in that, when said first output is a part of an
instruction having an instruction length longer than
said predetermined bit width, said selector (SEL)
transmits said no-effect command instruction (NOP) in
response to said information in order to invalidate the
second decoded result, so that said instruction
execution unit (EU) executes said first output in
response to said first decoded result.

2. Microprocessor according to claim 1, wherein, when
said control unit (PCNT) invalidates the decoded result
(id1_out) of said second instruction decoder (ID1),
said instruction execution unit (EU) determines an
address of an operand in response to that bit
information of said output lines of said read unit (IU)
which corresponds to a bit position of the invalidated
decoded result of said second instruction decoder.

3. Microprocessor according to claim 1, wherein the
predetermined bit width is the shortest instruction
length.